1. A method of indexing multimedia documents, the method being characterized in that it comprises the following steps:

a) for each document, identifying and extracting terms t_i constituted by vectors characterizing properties of the multimedia document for indexing, such as shape, texture, color, or structure of an image, the energy, the oscillation rate or frequency information of an audio signal, or a group of characters of a text;

b) storing the terms t_i characterizing the properties of the multimedia document in a term base comprising P terms;

c) determining a maximum number N of desired concepts combining the most pertinent terms t_i, where N is an integer less than P, with each concept c_i being designed to combine all terms that are neighboring from the point of view of their characteristics;

d) calculating the matrix T of distances between the terms t_i of the term base;

e) decomposing the set P of terms t_i of the term base into N portions P_j (1 ≤ j ≤ N) such that P = P_1 ∪ P_2 ... ∪ P_j ... ∪ P_N, each portion P_j comprising a set of terms t_ij and being represented by a concept c_j, the terms t_i being distributed in such a manner that terms that are farther away are to be found in distinct portions P_l, P_m while terms that are closer together are to be found in the same portion P_l;

f) structuring a concept dictionary so as to constitute a binary tree in which the leaves contain the concepts c_i of the dictionary and the nodes of the tree contain the information necessary for scanning the tree during a stage of identifying a document by comparing it with previously-indexed documents; and

g) constructing a fingerprint base made up of the set of concepts c_i representing the terms t_i of the documents to be indexed, each document being associated with a fingerprint that is specific thereto.

2. An indexing method according to claim 1, characterized in that each concept c_i of the fingerprint base is associated with a data set comprising the number of terms No. T in the documents in which the concept c_i is present.

3. An indexing method according to claim 1 or claim 2, characterized in that for each document in which a concept c_i is present, a fingerprint of the concept c_i is registered in the document, said fingerprint containing the frequency with which the concept c_i occurs, the identities of concepts neighboring the concept c_i in the document, and a score which is a mean value of similarity measurements between the concept c_i and the terms t_i of the document that are the closest to the concept c_i.

4. An indexing method according to any one of claims 1 to 3, characterized in that it comprises a step of optimizing the partitioning of the set P of terms of the term base to decompose said set P into M classes C_i (1 ≤ i ≤ M, where M ≤ P), so as to reduce the distribution error of the set P of terms in the term base into N portions (P_1, P_2, ..., P_N) where each portion P_i is represented by the term t_i that is taken as the concept c_i, the error ε that is committed being such that

$$\varepsilon = \sum_{i=1}^{N} \varepsilon_{t_i}, \quad \text{where} \quad \varepsilon_{t_i} = \sum_{t_j \in P_i} d^2(t_i, t_j)$$

is the error committed by replacing the terms t_j of a portion P_i with t_i.

5. An indexing method according to claim 4, characterized in that it comprises the following steps:

i) decomposing the set P of terms into two portions P_1 and P_2;

ii) determining the two terms t_i and t_j of the set P that are the furthest apart, corresponding to the greatest distance D_ij of the distance matrix T;

iii) for each term t_k of the set P, examining to see whether the distance D_ki between the term t_k and the term t_i is less than the distance D_kj between the term t_k and the term t_j, and if so, allocating the term t_k to the portion P_1, and otherwise allocating the term t_k to the portion P_2; and

iv) iterating step i) until the desired number N of portions P_i has been obtained, and on each iteration applying the steps ii) and iii) on the terms of the portions P_1 and P_2.
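
The splitting steps of claim 5 can be illustrated with a minimal Python sketch. It is not the claimed implementation: it assumes Euclidean distance between term vectors, assumes that the largest remaining portion is the one split next (the claim does not say which portion to choose), and assumes the terms are not all identical. The function names `split_by_farthest_pair` and `decompose` are hypothetical.

```python
import numpy as np

def split_by_farthest_pair(terms, indices):
    """Steps ii) and iii): split one portion around its two farthest terms."""
    sub = terms[indices]
    # Distance matrix restricted to this portion (Euclidean, an assumption).
    dist = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    # t_k goes to the first portion if D_ki < D_kj, otherwise to the second.
    closer_to_i = dist[:, i] < dist[:, j]
    return indices[closer_to_i], indices[~closer_to_i]

def decompose(terms, n_portions):
    """Step iv): repeat the split until n_portions portions exist."""
    portions = [np.arange(len(terms))]
    while len(portions) < n_portions:
        # Assumption: split the largest portion next.
        portions.sort(key=len)
        largest = portions.pop()
        if len(largest) < 2:
            portions.append(largest)
            break
        portions.extend(split_by_farthest_pair(terms, largest))
    return portions

terms = np.array([[0, 0], [0, 1], [10, 0], [10, 1], [30, 0], [30, 1]], dtype=float)
portions = decompose(terms, 3)
```

Each returned array holds the row indices of the terms allocated to one portion.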

6. An indexing method according to claim 4 or claim 5, characterized in that it includes optimization starting from N disjoint portions {P_1, P_2, ..., P_N} of the set P and N terms {t_1, t_2, ..., t_N} representing them in order to reduce the decomposition error of the set P into N portions, and in that it comprises the following steps:

i) calculating the centers of gravity C_i of the portions P_i;

ii) calculating errors

$$\varepsilon_{C_i} = \sum_{t_j \in P_i} d^2(C_i, t_j) \quad \text{and} \quad \varepsilon_{t_i} = \sum_{t_j \in P_i} d^2(t_i, t_j)$$

when replacing the terms t_j of the portion P_i respectively by C_i and by t_i;

iii) comparing ε_{t_i} and ε_{C_i} and replacing t_i by C_i if ε_{C_i} < ε_{t_i}; and

iv) calculating a new distance matrix T between the terms t_i of the term base and repeating the process of decomposing the set P of terms of the term base into N portions, unless a stop condition is satisfied with

$$\frac{\varepsilon c_t - \varepsilon c_{t+1}}{\varepsilon c_t} < \text{threshold}$$

where εc_t represents the error committed at instant t.
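
The optimization loop of claim 6 can be sketched as follows. This is an illustration only: it assumes Euclidean distance, assumes that "decomposing" means reassigning every term to its nearest representative, and the names `sq_error`, `refine`, and `max_iter` are hypothetical. Under those assumptions the loop closely resembles Lloyd's k-means iteration.

```python
import numpy as np

def sq_error(rep, members):
    """Sum of squared Euclidean distances d^2(rep, t_j) over one portion."""
    return float(((members - rep) ** 2).sum())

def refine(terms, reps, threshold=1e-3, max_iter=100):
    terms = np.asarray(terms, dtype=float)
    reps = np.array(reps, dtype=float)
    prev = None
    for _ in range(max_iter):
        # Step iv): recompute distances and reassign each term to its
        # nearest representative (assumed reading of "decomposing").
        d = np.linalg.norm(terms[:, None, :] - reps[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        total = 0.0
        for i in range(len(reps)):
            members = terms[labels == i]
            if len(members) == 0:
                continue
            center = members.mean(axis=0)        # step i): center of gravity
            e_c = sq_error(center, members)      # step ii): error with C_i
            e_t = sq_error(reps[i], members)     # step ii): error with t_i
            if e_c < e_t:                        # step iii)
                reps[i] = center
            total += min(e_c, e_t)
        # Stop when the relative decrease of the error falls below threshold.
        if prev is not None and (prev == 0 or (prev - total) / prev < threshold):
            break
        prev = total
    return reps, labels

terms = np.array([[0, 0], [0, 2], [10, 0], [10, 2]], dtype=float)
reps, labels = refine(terms, reps=[[0, 0], [10, 0]])
```

In this small example the two representatives move from the initial terms to the centers of gravity of their portions.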
7. An indexing method according to any one of claims 1 to 6, characterized in that for the purpose of structuring the concept dictionary, a navigation chart is produced iteratively, beginning by splitting the set of concepts into two subsets, and then selecting one subset on each iteration until the desired number of groups is obtained or until a stop criterion is satisfied.

8. An indexing method according to claim 7, characterized in that the stop criterion is constituted by the fact that the subsets obtained are all homogeneous with small standard deviation.

9. An indexing method according to claim 7 or claim 8, characterized in that during the structuring of the concept dictionary, navigation indicators are determined from a matrix M = [c_1, c_2, ..., c_N] ∈ ℜ^{p×N} of the set C of concepts c_i ∈ ℜ^p, where c_i represents a concept of p values, by implementing the following steps:

i) calculating a representative w of the matrix M;

ii) calculating the covariance matrix M̃ between the elements of the matrix M and the representative w of the matrix M;

iii) calculating a projection axis u for projecting the elements of the matrix M;

iv) calculating the value p_i = d(u, c_i) - d(u, w) and decomposing the set of concepts C into two subsets C_1 and C_2 as follows:

c_i ∈ C_1 if p_i ≤ 0
c_i ∈ C_2 if p_i > 0

v) storing the information {u, w, |p_1|, p_2} in the node associated with C, where p_1 is the maximum of all p_i ≤ 0 and p_2 is the minimum of all p_i > 0, the data set {u, w, |p_1|, p_2} constituting the navigation indicators in the concept dictionary.
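
One possible reading of claim 9 is sketched below. The claim leaves the representative w, the projection axis u, and the function d unspecified, so the sketch makes three labeled assumptions: w is the centroid of the concepts, u is the principal eigenvector of the covariance matrix, and d(u, x) is the scalar projection of x on u. The function name `navigation_indicators` and the dictionary keys are hypothetical.

```python
import numpy as np

def navigation_indicators(M):
    """M has shape (p, N): one concept c_i per column."""
    M = np.asarray(M, dtype=float)
    w = M.mean(axis=1)                        # assumption: w is the centroid
    centered = M - w[:, None]
    cov = centered @ centered.T / M.shape[1]  # covariance about w
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    u = eigvecs[:, -1]                        # assumption: principal axis
    p = u @ M - u @ w                         # assumption: d(u, x) = <u, x>
    c1 = np.where(p <= 0)[0]                  # indices of concepts in C_1
    c2 = np.where(p > 0)[0]                   # indices of concepts in C_2
    p1 = p[c1].max() if c1.size else 0.0
    p2 = p[c2].min() if c2.size else 0.0
    node = {"u": u, "w": w, "abs_p1": abs(p1), "p2": p2}
    return node, c1, c2

M = np.array([[0.0, 1.0, 10.0, 11.0],
              [0.0, 0.0, 0.0, 0.0]])
node, c1, c2 = navigation_indicators(M)
```

The sign of an eigenvector is arbitrary, so which group ends up in C_1 may vary; the split itself and the stored margins |p_1| and p_2 do not depend on that sign.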



10. An indexing method according to any one of claims 1 to 9, characterized in that both the structural components and the complements of said structural components constituted by the textural components of an image of the document are analyzed, and in that:

a) while analyzing the structural components of the image:

a1) boundary zones of the image structures are distributed into different classes depending on the orientation of the local variation in intensity so as to define structural support elements (SSEs) of the image; and

a2) performing statistical analysis to construct terms constituted by vectors describing the local properties and the global properties of the structural support elements;

b) while analyzing the textural components of the image:

b1) detecting and performing parametric characterization of a purely random component of the image;

b2) detecting and performing parametric characterization of a periodic component of the image; and

b3) detecting and performing parametric characterization of a directional component of the image;

c) grouping the set of descriptive elements of the image in a limited number of concepts constituted firstly by the terms describing the local and global properties of structural support elements and secondly by the parameters of the parametric characterizations of the random, periodic, and directional components defining the textural components of the image; and

d) for each document, defining a fingerprint from the occurrences, the positions, and the frequencies of said concepts.

11. An indexing method according to claim 10, characterized in that the local properties of the structural support elements taken into consideration for constructing terms comprise at least the support types selected from amongst a linear strip or a curved arc, the length and width dimensions of the support, the main direction of the support, and the shape and the statistical properties of the pixels constituting the support.

12. An indexing method according to claim 10 or claim 11, characterized in that the global properties of the structural support element taken into account for constructing terms comprise at least the number of each type of support and the spatial disposition thereof.

13. An indexing method according to any one of claims 10 to 12, characterized in that during analysis of the structural components of the image, a prior test is performed to detect whether at least one structure is present in the image, and in the absence of any structure, the method passes directly to the step of analyzing the textural components of the image.



14. An indexing method according to any one of claims 10 to 13, characterized in that in order to decompose boundary zones of the image structures into different classes, starting from the digitized image defined by the set of pixels y(i,j) where (i,j) ∈ I × J, where I and J designate respectively the number of rows and the number of columns of the image, the vertical gradient image g_v(i,j) where (i,j) ∈ I × J and the horizontal gradient image g_h(i,j) with (i,j) ∈ I × J are calculated, and the image is partitioned depending on the local orientation of its gradient into a finite number of equidistant classes, the image containing the orientation of the gradient being defined by the equation:

$$O(i,j) = \arctan\left[\frac{g_h(i,j)}{g_v(i,j)}\right]$$

the classes constituting support regions likely to contain significant support elements are identified, and on the basis of the support regions, significant support elements are determined and indexed using predetermined criteria.
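
The orientation partition of claim 14 might be sketched as below. This is a hedged illustration, not the claimed procedure: gradients are estimated with `numpy.gradient` (the claim does not name a gradient operator), `arctan2` is swapped in for the plain arctangent of the ratio so that g_v = 0 does not cause a division by zero, and the number of classes (8) and the function name `orientation_classes` are assumptions.

```python
import numpy as np

def orientation_classes(image, n_classes=8):
    image = np.asarray(image, dtype=float)
    # g_v: variation along rows (vertical); g_h: along columns (horizontal).
    g_v, g_h = np.gradient(image)
    # arctan2 replaces arctan(g_h / g_v); the result lies in [-pi, pi].
    orientation = np.arctan2(g_h, g_v)
    width = 2 * np.pi / n_classes
    # Equidistant classes, with bins centered on multiples of `width`.
    classes = np.floor((orientation + np.pi + width / 2) / width).astype(int)
    return orientation, classes % n_classes

ramp_h = np.tile(np.arange(5.0), (4, 1))   # intensity grows along columns
orient_h, classes_h = orientation_classes(ramp_h)
orient_v, classes_v = orientation_classes(ramp_h.T)  # grows along rows
```

Pixels sharing a class index would then form the candidate support regions mentioned in the claim.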

15. An indexing method according to any one of claims 1 to 9, characterized in that while indexing a multimedia document comprising video signals, terms t_i are selected that are constituted by key-images representing groups of consecutive homogeneous images, and concepts c_i are determined by grouping together terms t_i.

16. An indexing method according to claim 15, characterized in that in order to determine key-images constituting terms t_i, a score vector SV is initially generated comprising a set of elements SV(i) representative of the difference or similarity between the content of an image of index i and the content of an image of index i-1, and the score vector SV is analyzed in order to determine key-images which correspond to maximums of the values of the elements SV(i) of the score vector SV.

17. An indexing method according to claim 16, characterized in that an image of index j is considered as being a key-image if the value SV(j) of the corresponding element of the score vector SV is a maximum and the value SV(j) is situated between two minimums minL and minR, and if the minimum M1 such that M1 = min(|SV(j) - minL|, |SV(j) - minR|) is greater than a given threshold.
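
The key-image test of claims 16 and 17 can be sketched as follows, assuming the score vector SV is already available as a list of numbers. The function name `key_images` is hypothetical, and the choice to take minL and minR as the nearest local minima on each side of the maximum is an assumption about how the two minimums are located.

```python
def key_images(sv, threshold):
    """Return indices j where SV(j) is a local maximum and
    M1 = min(|SV(j) - minL|, |SV(j) - minR|) exceeds the threshold."""
    keys = []
    n = len(sv)
    for j in range(1, n - 1):
        if not (sv[j] > sv[j - 1] and sv[j] >= sv[j + 1]):
            continue  # not a local maximum
        # Walk downhill to the nearest local minimum on the left (minL).
        left = j
        while left > 0 and sv[left - 1] <= sv[left]:
            left -= 1
        # Walk downhill to the nearest local minimum on the right (minR).
        right = j
        while right < n - 1 and sv[right + 1] <= sv[right]:
            right += 1
        m1 = min(abs(sv[j] - sv[left]), abs(sv[j] - sv[right]))
        if m1 > threshold:
            keys.append(j)
    return keys

selected = key_images([0, 5, 1, 2, 1, 6, 0], threshold=2)
# selected == [1, 5]; the small bump at index 3 is rejected because M1 = 1
```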

18. An indexing method according to any one of claims 1 to 9, characterized in that while indexing a multimedia document comprising audio components, the document is sampled and decomposed into frames, which frames are subsequently grouped together into clips each being characterized by a term t_i constituted by a parameter vector.

19. An indexing method according to claim 18, characterized in that a frame comprises about 512 samples to about 2,048 samples of the sampled audio document.

20. An indexing method according to claim 18 or claim 19, characterized in that the parameters taken into account to define the terms t_i comprise time information corresponding to at least one of the following parameters: the energy of the audio signal frames, the standard deviation of frame energies in the clips, the sound variation ratio, the low energy ratio, the rate of oscillation about a predetermined value, the high rate of oscillation about a predetermined value, the difference between the number of oscillation rates above and below the mean oscillation rate for the frames of the clips, the variance of the oscillation rate, the ratio of silent frames.
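
A few of the time-domain parameters listed in claim 20 could be computed for one clip as sketched below. The exact definitions are not given in the claims, so every formula here is an assumption: energy is the mean squared sample value of a frame, the oscillation rate is measured about zero as the fraction of sign changes, the low energy ratio counts frames below half the mean energy, and `silence_energy` is a hypothetical cutoff. The frame length of 1,024 samples lies within the range of claim 19.

```python
import numpy as np

def clip_term(signal, frame_len=1024, silence_energy=1e-4):
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)            # energy of each frame
    # Oscillation rate about zero: share of consecutive samples changing sign.
    signs = np.signbit(frames)
    osc = (signs[:, 1:] != signs[:, :-1]).mean(axis=1)
    return np.array([
        energy.mean(),                             # mean frame energy
        energy.std(),                              # std of frame energies
        (energy < 0.5 * energy.mean()).mean(),     # low energy ratio (assumed)
        osc.mean(),                                # mean oscillation rate
        osc.var(),                                 # variance of oscillation rate
        (energy < silence_energy).mean(),          # ratio of silent frames
    ])

# One silent frame followed by one frame alternating between +1 and -1.
signal = np.concatenate([np.zeros(1024), np.tile([1.0, -1.0], 512)])
term = clip_term(signal)
```

The resulting vector plays the role of the parameter vector t_i that characterizes the clip.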

21. An indexing method according to any one of claims 18 to 20, characterized in that the parameters taken into account for defining the terms t_i comprise frequency information corresponding to at least one of the following parameters: the center of gravity of the frequency spectrum of the short Fourier transform of the audio signal, the bandwidth of the audio signal, the ratio between the energy in a frequency band to the total energy in the entire frequency band of the sampled audio signal, the mean value of spectrum variation of two adjacent frames in a clip, the cutoff frequency of a clip.
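
Two of the frequency parameters of claim 21, the center of gravity of the spectrum and the bandwidth, might be computed for a single frame as in the sketch below. The definitions are assumptions: the center of gravity is the power-weighted mean frequency and the bandwidth is the power-weighted standard deviation around it. The function name `spectral_features` is hypothetical.

```python
import numpy as np

def spectral_features(frame, sample_rate):
    frame = np.asarray(frame, dtype=float)
    power = np.abs(np.fft.rfft(frame)) ** 2            # power spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    total = power.sum()
    if total == 0:
        return 0.0, 0.0                                # silent frame
    centroid = (freqs * power).sum() / total           # center of gravity
    bandwidth = np.sqrt((((freqs - centroid) ** 2) * power).sum() / total)
    return centroid, bandwidth

# A 1,000 Hz sine sampled at 8,000 Hz falls exactly on one FFT bin
# when the frame holds 1,024 samples.
t = np.arange(1024) / 8000.0
centroid, bandwidth = spectral_features(np.sin(2 * np.pi * 1000.0 * t), 8000)
```

For this pure tone the center of gravity sits at the tone frequency and the bandwidth is close to zero.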

22. An indexing method according to any one of claims 18 to 21, characterized in that the parameters taken into account for defining the terms t_i comprise at least energy modulation at 4 Hz.

23. An indexing method according to any one of claims 1 to 14, characterized in that the shapes of an image of a document are analyzed using the following steps:

a) performing multiresolution followed by decimation of the image;

b) defining the image in polar logarithmic space;

c) representing the query image or image portion by its Fourier transform H;

d) characterizing the Fourier transform H as follows:

d1) projecting H in a plurality of directions to obtain a set of vectors of dimension equal to the projection movement dimension; and

d2) calculating the statistical properties of each projection vector; and

e) representing the shape of the image by a term t_i constituted by values for the statistical properties of each projection vector.